1.
Diabetes Care ; 47(5): 826-834, 2024 May 01.
Article in English | MEDLINE | ID: mdl-38498185

ABSTRACT

OBJECTIVE: To explore associations of HLA class II genes (HLAII) with the progression of islet autoimmunity from asymptomatic to symptomatic type 1 diabetes (T1D). RESEARCH DESIGN AND METHODS: Next-generation targeted sequencing was used to genotype eight HLAII genes (DQA1, DQB1, DRB1, DRB3, DRB4, DRB5, DPA1, DPB1) in 1,216 participants from the Diabetes Prevention Trial-1 and Randomized Diabetes Prevention Trial with Oral Insulin sponsored by TrialNet. By the linkage disequilibrium, DQA1 and DQB1 are haplotyped to form DQ haplotypes; DP and DR haplotypes are similarly constructed. Together with available clinical covariables, we applied the Cox regression model to assess HLAII immunogenic associations with the disease progression. RESULTS: First, the current investigation updated the previously reported genetic associations of DQA1*03:01-DQB1*03:02 (hazard ratio [HR] = 1.25, P = 3.50 × 10⁻³) and DQA1*03:03-DQB1*03:01 (HR = 0.56, P = 1.16 × 10⁻³), and also uncovered a risk association with DQA1*05:01-DQB1*02:01 (HR = 1.19, P = 0.041). Second, after adjusting for DQ, DPA1*02:01-DPB1*11:01 and DPA1*01:03-DPB1*03:01 were found to have opposite associations with progression (HR = 1.98 and 0.70, P = 0.021 and 6.16 × 10⁻³, respectively). Third, DRB1*03:01-DRB3*01:01 and DRB1*03:01-DRB3*02:02, sharing the DRB1*03:01, had opposite associations (HR = 0.73 and 1.44, P = 0.04 and 0.019, respectively), indicating a role of DRB3. Meanwhile, DRB1*12:01-DRB3*02:02 and DRB1*01:03 alone were found to associate with progression (HR = 2.6 and 2.32, P = 0.018 and 0.039, respectively). Fourth, through enumerating all heterodimers, it was found that both DQ and DP could exhibit associations with disease progression. CONCLUSIONS: These results suggest that HLAII polymorphisms influence progression from islet autoimmunity to T1D among at-risk subjects with islet autoantibodies.


Subject(s)
Diabetes Mellitus, Type 1 , Humans , Diabetes Mellitus, Type 1/genetics , Diabetes Mellitus, Type 1/prevention & control , Seroconversion , Genotype , Haplotypes , Disease Progression , HLA-DRB1 Chains/genetics , HLA-DQ beta-Chains/genetics , Alleles , Gene Frequency
2.
JAMA Netw Open ; 6(2): e230191, 2023 02 01.
Article in English | MEDLINE | ID: mdl-36809468

ABSTRACT

Importance: Earlier detection of emerging novel SARS-CoV-2 variants is important for public health surveillance of potential viral threats and for earlier prevention research. Artificial intelligence may facilitate early detection of SARS-CoV-2 emerging novel variants based on variant-specific mutation haplotypes and, in turn, be associated with enhanced implementation of risk-stratified public health prevention strategies. Objective: To develop a haplotype-based artificial intelligence (HAI) model for identifying novel variants, including mixture variants (MVs) of known variants and new variants with novel mutations. Design, Setting, and Participants: This cross-sectional study used serially observed viral genomic sequences globally (prior to March 14, 2022) to train and validate the HAI model and used it to identify variants arising from a prospective set of viruses from March 15 to May 18, 2022. Main Outcomes and Measures: Viral sequences, collection dates, and locations were subjected to statistical learning analysis to estimate variant-specific core mutations and haplotype frequencies, which were then used to construct an HAI model to identify novel variants. Results: Through training on more than 5 million viral sequences, an HAI model was built, and its identification performance was validated on an independent validation set of more than 5 million viruses. Its identification performance was assessed on a prospective set of 344 901 viruses. In addition to achieving an accuracy of 92.8% (95% CI within 0.1%), the HAI model identified 4 Omicron MVs (Omicron-Alpha, Omicron-Delta, Omicron-Epsilon, and Omicron-Zeta), 2 Delta MVs (Delta-Kappa and Delta-Zeta), and 1 Alpha-Epsilon MV, among which Omicron-Epsilon MVs were most frequent (609/657 MVs [92.7%]). Furthermore, the HAI model found that 1699 Omicron viruses had unidentifiable variants given that these variants acquired novel mutations. Lastly, 524 variant-unassigned and variant-unidentifiable viruses carried 16 novel mutations, 8 of which were increasing in prevalence percentages as of May 2022. Conclusions and Relevance: In this cross-sectional study, an HAI model found SARS-CoV-2 viruses with MV or novel mutations in the global population, which may require closer examination and monitoring. These results suggest that HAI may complement phylogenic variant assignment, providing additional insights into emerging novel variants in the population.


Subject(s)
Artificial Intelligence , COVID-19 , Humans , Cross-Sectional Studies , Haplotypes , Prospective Studies , RNA, Viral , SARS-CoV-2 , Mutation
3.
Sci Rep ; 12(1): 19089, 2022 11 09.
Article in English | MEDLINE | ID: mdl-36352021

ABSTRACT

Extensive mutations in the Omicron spike protein appear to accelerate the transmission of SARS-CoV-2, and rapid infections increase the odds that additional mutants will emerge. To build an investigative framework, we have applied an unsupervised machine learning approach to 4296 Omicron viral genomes collected and deposited to GISAID as of December 14, 2021, and have identified a core haplotype of 28 polymutants (A67V, T95I, G339D, R346K, S371L, S373P, S375F, K417N, N440K, G446S, S477N, T478K, E484A, Q493R, G496S, Q498R, N501Y, Y505H, T547K, D614G, H655Y, N679K, P681H, N764K, K796Y, N856K, Q954H, N69K, L981F) in the spike protein and a separate core haplotype of 17 polymutants in non-spike genes: (K38, A1892) in nsp3, T492 in nsp4, (P132, V247, T280, S284) in 3C-like proteinase, I189 in nsp6, P323 in RNA-dependent RNA polymerase, I42 in Exonuclease, T9 in envelope protein, (D3, Q19, A63) in membrane glycoprotein, and (P13, R203, G204) in nucleocapsid phosphoprotein. Using these core haplotypes as reference, we have identified four newly emerging polymutants (R346, A701, I1081, N1192) in the spike protein (p value = 9.37 × 10⁻⁴, 1.0 × 10⁻¹⁵, 4.76 × 10⁻⁷ and 1.56 × 10⁻⁴, respectively), and five additional polymutants in non-spike genes (D343G in nucleocapsid phosphoprotein, V1069I in nsp3, V94A in nsp4, F694Y in the RNA-dependent RNA polymerase and L106L/F of ORF3a) that exhibit significant increasing trajectories (all p values < 1.0 × 10⁻¹⁵). In the absence of relevant clinical data for these newly emerging mutations, it is important to monitor them closely. Two emerging mutations may be of particular concern: the N1192S mutation in spike protein locates in an extremely highly conserved region of all human coronaviruses that is integral to the viral fusion process, and the F694Y mutation in the RNA polymerase may induce conformational changes that could impact remdesivir binding.


Subject(s)
COVID-19 , Spike Glycoprotein, Coronavirus , Humans , Spike Glycoprotein, Coronavirus/genetics , Unsupervised Machine Learning , SARS-CoV-2/genetics , COVID-19/epidemiology , COVID-19/genetics , RNA-Dependent RNA Polymerase , Mutation , Phosphoproteins/genetics
4.
JAMA Netw Open ; 5(9): e2230293, 2022 09 01.
Article in English | MEDLINE | ID: mdl-36069983

ABSTRACT

Importance: With timely collection of SARS-CoV-2 viral genome sequences, it is important to apply efficient data analytics to detect emerging variants at the earliest time. Objective: To evaluate the application of a statistical learning strategy (SLS) to improve early detection of novel SARS-CoV-2 variants using viral sequence data from global surveillance. Design, Setting, and Participants: This case series applied an SLS to viral genomic sequence data collected from 63 686 individuals in Africa and 531 827 individuals in the United States with SARS-CoV-2. Data were collected from January 1, 2020, to December 28, 2021. Main Outcomes and Measures: The outcome was an indicator of Omicron variant derived from viral sequences. Centering on a temporally collected outcome, the SLS used the generalized additive model to estimate locally averaged Omicron caseload percentages (OCPs) over time to characterize Omicron expansion and to estimate when OCP exceeded 10%, 25%, 50%, and 75% of the caseload. Additionally, an unsupervised learning technique was applied to visualize Omicron expansions, and temporal and spatial distributions of Omicron cases were investigated. Results: In total, there were 2698 cases of Omicron in Africa and 12 141 in the United States. The SLS found that Omicron was detectable in South Africa as early as December 31, 2020. With 10% OCP as a threshold, it may have been possible to declare Omicron a variant of concern as early as November 4, 2021, in South Africa. In the United States, the application of SLS suggested that the first case was detectable on November 21, 2021. Conclusions and Relevance: The application of SLS demonstrates how the Omicron variant may have emerged and expanded in Africa and the United States. Earlier detection could help the global effort in disease prevention and control. To optimize early detection, efficient data analytics, such as SLS, could assist in the rapid identification of new variants as soon as they emerge, with or without lineages designated, using viral sequence data from global surveillance.


Subject(s)
COVID-19 , SARS-CoV-2 , COVID-19/epidemiology , Genome, Viral/genetics , Humans , Mutation , SARS-CoV-2/genetics , South Africa , United States/epidemiology
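
The threshold step described in the abstract above (estimating when the locally averaged Omicron caseload percentage passes 10%, 25%, 50%, or 75%) can be illustrated with a minimal sketch. It is an assumption-laden stand-in, not the authors' method: a centered moving average replaces the generalized additive model, and the function names and daily counts are hypothetical.

```python
from datetime import date, timedelta


def moving_average(values, window):
    """Centered moving average; the window shrinks at both ends of the series."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


def first_crossings(days, variant_counts, total_counts, thresholds, window=7):
    """Map each threshold to the first day the smoothed variant share reaches it.

    Returns None for a threshold that is never reached.
    """
    shares = [v / t if t else 0.0 for v, t in zip(variant_counts, total_counts)]
    smooth = moving_average(shares, window)
    return {
        th: next((d for d, s in zip(days, smooth) if s >= th), None)
        for th in thresholds
    }


# Hypothetical daily counts for illustration only; not data from the study.
start = date(2021, 11, 1)
days = [start + timedelta(days=i) for i in range(10)]
variant = [0, 0, 1, 2, 4, 8, 15, 30, 55, 70]
total = [100] * 10
crossings = first_crossings(days, variant, total, [0.10, 0.25, 0.50], window=3)
```

A smoother with uncertainty estimates, such as the generalized additive model named in the abstract, would be needed before reading any crossing date as more than a rough indication.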
5.
Int J Immunogenet ; 49(5): 333-339, 2022 Oct.
Article in English | MEDLINE | ID: mdl-35959717

ABSTRACT

Multiple sclerosis (MS) is a chronic neurological disease believed to be caused by autoimmune pathogenesis. The aetiology is likely explained by a complex interplay between inherited and environmental factors. Genetic investigations into MS have been conducted for over 50 years, yielding >100 associations to date. Globally, the strongest linkage is with the human leukocyte antigen (HLA) HLA-DRB5*01:01:01-DRB1*15:01:01-DQA1*01:02:01-DQB1*06:02:01 haplotype. Here, high-resolution sequencing of HLA was used to determine the alleles of DRB3, DRB4, DRB5, DRB1, DQA1, DQB1, DPA1 and DPB1 as well as their extended haplotypes and genotypes in 100 Swedish MS patients. Results were compared to 636 population controls. The heterogeneity in HLA associations with MS was demonstrated; among 100 patients, 69 extended HLA-DR-DQ genotypes were found. Three extended HLA-DR-DQ genotypes were found to be correlated to MS; HLA-DRB5*01:01:01-DRB1*15:01:01-DQA1*01:02:01-DQB1*06:02:01 haplotype together with (A) HLA-DRB4*01:01:01//DRB4*01:01:01:01-DRB1*07:01:01-DQA1*02:01//02:01:01-DQB1*02:02:01, (B) HLA-DRBX*null-DRB1*08:01:01-DQA1*04:01:01-DQB1*04:02:01, and (C) HLA-DRB3*01:01:02-DRB1*03:01:01-DQA1*05:01:01-DQB1*02:01:01. At the allelic level, HLA-DRB3*01:01:02 was considered protective against MS. However, when combined with HLA-DRB3*01:01:02-DRB1*03:01:01-DQA1*05:01:01-DQB1*02:01:01, this extended haplotype was considered a predisposing risk factor. This highlights the limitations as included with investigations of single alleles relative to those of extended haplotypes/genotypes. In conclusion, with 69 genotypes presented among 100 patients, high-resolution sequencing was conducted to underscore the wide polymorphisms present among MS patients. Additional studies in larger cohorts will be of importance to define MS among the patient group not associated with HLA-DRB5*01:01:01-DRB1*15:01:01-DQA1*01:02:01-DQB1*06:02:01.


Subject(s)
Multiple Sclerosis , HLA Antigens , HLA-DQ alpha-Chains/genetics , HLA-DQ beta-Chains/genetics , HLA-DRB1 Chains/genetics , HLA-DRB3 Chains/genetics , HLA-DRB5 Chains/genetics , Haplotypes , Humans , Multiple Sclerosis/genetics , Sweden
6.
Diabetes Care ; 45(7): 1610-1620, 2022 07 07.
Article in English | MEDLINE | ID: mdl-35621697

ABSTRACT

OBJECTIVE: The purpose was to test the hypothesis that the HLA-DQαβ heterodimer structure is related to the progression of islet autoimmunity from asymptomatic to symptomatic type 1 diabetes (T1D). RESEARCH DESIGN AND METHODS: Next-generation targeted sequencing was used to genotype HLA-DQA1-B1 class II genes in 670 subjects in the Diabetes Prevention Trial-Type 1 (DPT-1). Coding sequences were translated into DQ α- and β-chain amino acid residues and used in hierarchically organized haplotype (HOH) association analysis to identify motifs associated with diabetes onset. RESULTS: The opposite diabetes risks were confirmed for HLA DQA1*03:01-B1*03:02 (hazard ratio [HR] 1.36; P = 2.01 × 10⁻³) and DQA1*03:03-B1*03:01 (HR 0.62; P = 0.037). The HOH analysis uncovered residue -18β in the signal peptide and β57 in the β-chain to form six motifs. DQ*VA was associated with faster (HR 1.49; P = 6.36 × 10⁻⁴) and DQ*AD with slower (HR 0.64; P = 0.020) progression to diabetes onset. VA/VA, representing DQA1*03:01-B1*03:02 (DQ8/8), had a greater HR of 1.98 (P = 2.80 × 10⁻³). The DQ*VA motif was associated with both islet cell antibodies (P = 0.023) and insulin autoantibodies (IAAs) (P = 3.34 × 10⁻³), while the DQ*AD motif was associated with a decreased IAA frequency (P = 0.015). Subjects with DQ*VA and DQ*AD experienced, respectively, increasing and decreasing trends of HbA1c levels throughout the follow-up. CONCLUSIONS: HLA-DQ structural motifs appear to modulate progression from islet autoimmunity to diabetes among at-risk relatives with islet autoantibodies. Residue -18β within the signal peptide may be related to levels of protein synthesis and β57 to stability of the peptide-DQαβ trimolecular complex.


Subject(s)
Diabetes Mellitus, Type 1 , Islets of Langerhans , Autoantibodies , Autoimmunity/genetics , Diabetes Mellitus, Type 1/genetics , Diabetes Mellitus, Type 1/prevention & control , Genetic Predisposition to Disease , HLA-DQ Antigens/genetics , HLA-DQ alpha-Chains/genetics , HLA-DQ beta-Chains/genetics , Haplotypes , Humans , Protein Sorting Signals/genetics
7.
Res Sq ; 2022 Feb 25.
Article in English | MEDLINE | ID: mdl-35233566

ABSTRACT

Extensive mutations in the Omicron spike protein appear to accelerate the transmission of SARS-CoV-2, and rapid infections increase the odds that additional mutants will emerge. To build an investigative framework, we have applied an unsupervised machine learning approach to 4296 Omicron viral genomes collected and deposited to GISAID as of December 14, 2021, and have identified a core haplotype of 28 polymutants (A67V, T95I, G339D, R346K, S371L, S373P, S375F, K417N, N440K, G446S, S477N, T478K, E484A, Q493R, G496S, Q498R, N501Y, Y505H, T547K, D614G, H655Y, N679K, P681H, N764K, K796Y, N856K, Q954H, N69K, L981F) in the spike protein and a separate core haplotype of 17 polymutants in non-spike genes: (K38, A1892) in nsp3, T492 in nsp4, (P132, V247, T280, S284) in 3C-like proteinase, I189 in nsp6, P323 in RNA-dependent RNA polymerase, I42 in Exonuclease, T9 in envelope protein, (D3, Q19, A63) in membrane glycoprotein, and (P13, R203, G204) in nucleocapsid phosphoprotein. Using these core haplotypes as reference, we have identified four newly emerging polymutants (R346, A701, I1081, N1192) in the spike protein (p-value = 9.37 × 10⁻⁴, 1.0 × 10⁻¹⁵, 4.76 × 10⁻⁷ and 1.56 × 10⁻⁴, respectively), and five additional polymutants in non-spike genes (D343G in nucleocapsid phosphoprotein, V1069I in nsp3, V94A in nsp4, F694Y in the RNA-dependent RNA polymerase and L106L/F of ORF3a) that exhibit significant increasing trajectories (all p-values < 1.0 × 10⁻¹⁵). In the absence of relevant clinical data for these newly emerging mutations, it is important to monitor them closely. Two emerging mutations may be of particular concern: the N1192S mutation in spike protein locates in an extremely highly conserved region of all human coronaviruses that is integral to the viral fusion process, and the F694Y mutation in the RNA polymerase may induce conformational changes that could impact Remdesivir binding.

9.
Sci Rep ; 12(1): 1206, 2022 01 24.
Article in English | MEDLINE | ID: mdl-35075180

ABSTRACT

SARS-CoV-2 is spreading worldwide with continuously evolving variants, some of which occur in the Spike protein and appear to increase viral transmissibility. However, variants that cause severe COVID-19 or lead to other breakthroughs have not been well characterized. To discover such viral variants, we assembled a cohort of 683 COVID-19 patients; 388 inpatients ("cases") and 295 outpatients ("controls") from April to August 2020 using electronically captured COVID test request forms and sequenced their viral genomes. To improve the analytical power, we accessed 7137 viral sequences in Washington State to filter out viral single nucleotide variants (SNVs) that did not have significant expansions over the collection period. Applying this filter led to the identification of 53 SNVs that were statistically significant, of which 13 SNVs each had 3 or more variant copies in the discovery cohort. Correlating these selected SNVs with case/control status, eight SNVs were found to significantly associate with inpatient status (q-values < 0.01). Using temporal synchrony, we identified a four SNV-haplotype (t19839-g28881-g28882-g28883) that was significantly associated with case/control status (Fisher's exact p = 2.84 × 10⁻¹¹). This haplotype appeared in April 2020, peaked in June, and persisted into January 2021. The association was replicated (OR = 5.46, p-value = 4.71 × 10⁻¹²) in an independent cohort of 964 COVID-19 patients (June 1, 2020 to March 31, 2021). The haplotype included a synonymous change N73N in endoRNase, and three non-synonymous changes coding residues R203K, R203S and G204R in the nucleocapsid protein. This discovery points to the potential functional role of the nucleocapsid protein in triggering "cytokine storms" and severe COVID-19 that led to hospitalization. The study further emphasizes a need for tracking and analyzing viral sequences in correlations with clinical status.


Subject(s)
COVID-19 , Haplotypes , Hospitalization , Mutation , SARS-CoV-2/genetics , COVID-19/epidemiology , COVID-19/genetics , COVID-19/therapy , Female , Humans , Male , Washington/epidemiology
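
The abstract above reports an odds ratio and a Fisher's exact p-value for haplotype carriage versus case/control status. The sketch below shows how both quantities are computed from a 2x2 table using only the standard library. The function names and the example counts are hypothetical; the study's actual carrier counts are not given in the abstract.

```python
from math import comb


def odds_ratio(a, b, c, d):
    """Sample odds ratio for the table [[a, b], [c, d]]; requires b * c > 0."""
    return (a * d) / (b * c)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


# Hypothetical counts: rows are carriers / non-carriers of a haplotype,
# columns are inpatients ("cases") / outpatients ("controls").
example_or = odds_ratio(3, 1, 1, 3)
example_p = fisher_exact_two_sided(3, 1, 1, 3)
```

With cell counts this small the test has little power; the very small p-values reported in the abstract reflect much larger cohorts.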
10.
bioRxiv ; 2021 Jun 15.
Article in English | MEDLINE | ID: mdl-34159336

ABSTRACT

The emergence and establishment of SARS-CoV-2 variants of interest (VOI) and variants of concern (VOC) highlight the importance of genomic surveillance. We propose a statistical learning strategy (SLS) for identifying and spatiotemporally tracking potentially relevant Spike protein mutations. We analyzed 167,893 Spike protein sequences from US COVID-19 cases (excluding 21,391 sequences from VOI/VOC strains) deposited at GISAID from January 19, 2020 to March 15, 2021. Alignment against the reference Spike protein sequence led to the identification of viral residue variants (VRVs), i.e., residues harboring a substitution compared to the reference strain. Next, generalized additive models were applied to model VRV temporal dynamics, to identify VRVs with significant and substantial dynamics (false discovery rate q-value <0.01; maximum VRV proportion > 10% on at least one day). Unsupervised learning was then applied to hierarchically organize VRVs by spatiotemporal patterns and identify VRV-haplotypes. Finally, homology modelling was performed to gain insight into potential impact of VRVs on Spike protein structure. We identified 90 VRVs, 71 of which have not previously been observed in a VOI/VOC, and 35 of which have emerged recently and are durably present. Our analysis identifies 17 VRVs ∼91 days earlier than their first corresponding VOI/VOC publication. Unsupervised learning revealed eight VRV-haplotypes of 4 VRVs or more, suggesting two emerging strains (B1.1.222 and B.1.234). Structural modeling supported potential functional impact of the D1118H and L452R mutations. The SLS approach equally monitors all Spike residues over time, independently of existing phylogenic classifications, and is complementary to existing genomic surveillance methods.

11.
EBioMedicine ; 69: 103431, 2021 Jul.
Article in English | MEDLINE | ID: mdl-34153873

ABSTRACT

BACKGROUND: HLA-DR4, a common antigen of HLA-DRB1, has multiple subtypes that are strongly associated with risk of type 1 diabetes (T1D); however, some are risk neutral or resistant. The pathobiological mechanism of HLA-DR4 subtypes remains to be elucidated. METHODS: We used a population-based case-control study of T1D (962 patients and 636 controls) to decipher genetic associations of HLA-DR4 subtypes and specific residues with susceptibility to T1D. Using a birth cohort of 7865 children with periodically measured islet autoantibodies (GADA, IAA or IA-2A), we proposed to validate discovered genetic associations with a totally different study design and time-to-seroconversions prior to clinical onset of T1D. A novel analytic strategy hierarchically organized the HLA-DRB1 alleles by sequence similarity and identified critical amino acid residues by minimizing local genomic architecture and higher-order interactions. FINDINGS: Three amino acid residues of HLA-DRB1 (β71, β74, β86) were found to be predictive of T1D risk in the population-based study. The "KAG" motif, corresponding to HLA-DRB1*04:01, was most strongly associated with T1D risk ([O]dds [R]atio = 3.64, p = 3.19 × 10⁻⁶⁴). Three less frequent motifs ("EAV", OR = 2.55, p = 0.025; "RAG", OR = 1.93, p = 0.043; and "RAV", OR = 1.56, p = 0.003) were associated with T1D risk, while two motifs ("REG" and "REV") were equally protective (OR = 0.11, p = 4.23 × 10⁻⁴). In an independent birth cohort of HLA-DR3 and HLA-DR4 subjects, those having the "KAG" motif had increased risk for time-to-seroconversion (Hazard Ratio = 1.74, p = 6.51 × 10⁻¹⁴) after adjusting potential confounders. INTERPRETATIONS: DNA sequence variation in HLA-DRB1 at positions β71, β74, and β86 are non-conservative (β74 A→E, β71 E vs K vs R and β86 G vs V). They result in substantial differences in peptide antigen anchor pocket preferences at p1, p4 and potentially neighboring regions such as pocket p7. Differential peptide antigen binding is likely to be affected. These sequence substitutions may account for most of the HLA-DR4 contribution to T1D risk as illustrated in two HLA-peptide model complexes of the T1D autoantigens preproinsulin and GAD65. FUNDING: National Institute of Diabetes and Digestive and Kidney Diseases and the Swedish Child Diabetes Foundation and the Swedish Research Council.


Subject(s)
Diabetes Mellitus, Type 1/genetics , HLA-DRB1 Chains/genetics , Seroconversion , Amino Acid Motifs , Child , Child, Preschool , Diabetes Mellitus, Type 1/immunology , Female , HLA-DRB1 Chains/chemistry , HLA-DRB1 Chains/immunology , Humans , Infant , Male
12.
Sci Rep ; 11(1): 8821, 2021 04 23.
Article in English | MEDLINE | ID: mdl-33893332

ABSTRACT

HLA-DQ molecules account for over 50% of the genetic risk of type 1 diabetes (T1D), but little is known about associated residues. Through next generation targeted sequencing technology and deep learning of DQ residue sequences, the aim was to uncover critical residues and their motifs associated with T1D. Our analysis uncovered (αa1, α44, α157, α196) and (β9, β30, β57, β70, β135) on the HLA-DQ molecule. Their motifs captured all known susceptibility and resistant T1D associations. Three motifs, "DCAA-YSARD" (OR = 2.10, p = 1.96 × 10⁻²⁰), "DQAA-YYARD" (OR = 3.34, p = 2.69 × 10⁻⁷²) and "DQDA-YYARD" (OR = 3.71, p = 1.53 × 10⁻⁶) corresponding to DQ2.5 and DQ8.1 (the latter two motifs) associated with susceptibility. Ten motifs were significantly associated with resistance to T1D. Collectively, homozygous DQ risk motifs accounted for 43% of DQ-T1D risk, while homozygous DQ resistant motifs accounted for 25% protection to DQ-T1D risk. Of the identified nine residues, five were within or near anchoring pockets of the antigenic peptide (α44, β9, β30, β57 and β70), one was the N-terminal of the alpha chain (αa1), one in the CD4-binding region (β135), one in the putative cognate TCR-induced αβ homodimerization process (α157), and one in the intra-membrane domain of the alpha chain (α196). Finding these critical residues should allow investigations of fundamental properties of host immunity that underlie tolerance to self and organ-specific autoimmunity.


Subject(s)
Amino Acids/genetics , Diabetes Mellitus, Type 1/immunology , Disease Susceptibility/immunology , HLA-DQ Antigens/genetics , Amino Acids/chemistry , Case-Control Studies , Child , Child, Preschool , Diabetes Mellitus, Type 1/genetics , Gene Frequency , HLA-DQ Antigens/chemistry , Haplotypes , Humans , Risk Factors , Sweden
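
The abstract above describes residue motifs (strings of amino acids taken at selected positions) and their odds ratios in cases versus controls. The following sketch illustrates only that tabulation step. It does not reproduce the deep learning analysis used to select the residues. The function names, the toy sequences, and the use of a 0.5 continuity correction on every cell are assumptions made for illustration.

```python
from collections import Counter


def motif(sequence, positions):
    """Concatenate the residues at the given 1-based positions."""
    return "".join(sequence[p - 1] for p in positions)


def motif_odds_ratios(case_seqs, control_seqs, positions):
    """Odds ratio of each motif versus all other motifs.

    A 0.5 continuity correction is added to every cell so that motifs
    absent from one group still yield a finite estimate.
    """
    cases = Counter(motif(s, positions) for s in case_seqs)
    controls = Counter(motif(s, positions) for s in control_seqs)
    n_case, n_ctrl = len(case_seqs), len(control_seqs)
    ratios = {}
    for m in set(cases) | set(controls):
        a, c = cases[m], controls[m]          # motif present
        b, d = n_case - a, n_ctrl - c         # motif absent
        ratios[m] = ((a + 0.5) * (d + 0.5)) / ((b + 0.5) * (c + 0.5))
    return ratios


# Toy aligned sequences, not real HLA-DQ chains.
toy_cases = ["ACD", "ACD", "ACD", "GCD"]
toy_controls = ["ACD", "GCD", "GCD", "GCD"]
toy_ratios = motif_odds_ratios(toy_cases, toy_controls, positions=[1, 3])
```

An odds ratio above 1 marks a motif seen more often among cases and a value below 1 marks one seen more often among controls, which matches how the abstract separates susceptibility motifs from resistant ones.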
13.
J Genet Couns ; 30(6): 1591-1597, 2021 12.
Article in English | MEDLINE | ID: mdl-33881185

ABSTRACT

Our work evaluates the contributions of a genetics clinic visit in assessing patients' risk of hereditary cancers and in meeting National Comprehensive Cancer Network (NCCN) criteria for genetic testing. We reviewed the electronic health records (EHR) of 56 women seen for medical care in our healthcare system who were subsequently seen in the Adult Genetics Clinic. We searched for all personal or family cancer history available in either free-text or structured form within the EHR prior to the genetics visit. For each patient, we then compared the aggregate data with the pedigree information obtained at the Genetics Clinic visit for first-, second-, and third-degree relatives. During the genetics clinic visit, the number of relatives with cancer diagnoses doubled from 121 to 235, and for 17 of 56 patients (30%), family histories changed one or more NCCN criteria. For 39 of 56 patients (70%), the family history in the EHR was not changed during the genetics clinic visit. Of 56 women referred to the genetics clinic, 45 (80%) met NCCN guidelines for testing, 40 women underwent genetic testing, and 9 of the 40 tested (23%) were positive for a Likely Pathogenic or Pathogenic (LP/P) variant. This study of 56 women quantitatively demonstrates the value of a genetics clinic visit by improved identification of key family history components.


Subject(s)
Breast Neoplasms , Ovarian Neoplasms , Adult , Breast Neoplasms/diagnosis , Breast Neoplasms/genetics , Carcinoma, Ovarian Epithelial/genetics , Female , Genetic Predisposition to Disease , Genetic Testing , Humans , Ovarian Neoplasms/genetics , Pedigree
14.
Viruses ; 14(1), 2021 12 21.
Article in English | MEDLINE | ID: mdl-35062214

ABSTRACT

The emergence and establishment of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) variants of interest (VOIs) and variants of concern (VOCs) highlight the importance of genomic surveillance. We propose a statistical learning strategy (SLS) for identifying and spatiotemporally tracking potentially relevant Spike protein mutations. We analyzed 167,893 Spike protein sequences from coronavirus disease 2019 (COVID-19) cases in the United States (excluding 21,391 sequences from VOI/VOC strains) deposited at GISAID from 19 January 2020 to 15 March 2021. Alignment against the reference Spike protein sequence led to the identification of viral residue variants (VRVs), i.e., residues harboring a substitution compared to the reference strain. Next, generalized additive models were applied to model VRV temporal dynamics and to identify VRVs with significant and substantial dynamics (false discovery rate q-value < 0.01; maximum VRV proportion >10% on at least one day). Unsupervised learning was then applied to hierarchically organize VRVs by spatiotemporal patterns and identify VRV-haplotypes. Finally, homology modeling was performed to gain insight into the potential impact of VRVs on Spike protein structure. We identified 90 VRVs, 71 of which had not previously been observed in a VOI/VOC, and 35 of which have emerged recently and are durably present. Our analysis identified 17 VRVs ~91 days earlier than their first corresponding VOI/VOC publication. Unsupervised learning revealed eight VRV-haplotypes of four VRVs or more, suggesting two emerging strains (B1.1.222 and B.1.234). Structural modeling supported a potential functional impact of the D1118H and L452R mutations. The SLS approach equally monitors all Spike residues over time, independently of existing phylogenic classifications, and is complementary to existing genomic surveillance methods.


Subject(s)
COVID-19/virology , SARS-CoV-2/genetics , Spike Glycoprotein, Coronavirus/genetics , Amino Acid Sequence , COVID-19/epidemiology , Haplotypes , Humans , Models, Molecular , Models, Statistical , Mutation , SARS-CoV-2/classification , SARS-CoV-2/isolation & purification , Spatio-Temporal Analysis , Spike Glycoprotein, Coronavirus/chemistry , United States/epidemiology , Unsupervised Machine Learning
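
The first step described in the abstract above, calling viral residue variants (VRVs) by comparing each aligned sequence with a reference, can be sketched as follows. This is a minimal illustration under stated assumptions: sequences are already aligned to equal length, gap and unknown characters are skipped, and the function names and toy sequences are hypothetical. The positions in the example are toy positions, not Spike protein numbering.

```python
from collections import Counter

SKIP = {"-", "X"}  # gap and unknown-residue symbols (assumed conventions)


def residue_variants(reference, aligned):
    """List substitutions such as 'D4G' between a reference and an aligned sequence."""
    if len(reference) != len(aligned):
        raise ValueError("sequences must be aligned to the same length")
    return [
        f"{ref}{pos}{alt}"
        for pos, (ref, alt) in enumerate(zip(reference, aligned), start=1)
        if ref != alt and ref not in SKIP and alt not in SKIP
    ]


def variant_proportions(reference, sequences):
    """Proportion of sequences carrying each substitution."""
    counts = Counter(v for s in sequences for v in residue_variants(reference, s))
    return {v: c / len(sequences) for v, c in counts.items()}


# Toy example; not real viral sequences.
toy_reference = "MFVDL"
toy_sequences = ["MFVGL", "MFVGL", "MYVDL", "MFVDL", "MF-DL"]
toy_props = variant_proportions(toy_reference, toy_sequences)
```

Computing these proportions per collection day would give the kind of time series that the abstract then models with generalized additive models and filters by a maximum proportion above 10%.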
15.
J Am Med Inform Assoc ; 27(9): 1443-1449, 2020 07 01.
Article in English | MEDLINE | ID: mdl-32940694

ABSTRACT

OBJECTIVE: The genetic testing for hereditary breast cancer that is most helpful in high-risk women is underused. Our objective was to quantify the risk factors for heritable breast and ovarian cancer contained in the electronic health record (EHR), to determine how many women meet national guidelines for referral to a cancer genetics professional but have no record of a referral. METHODS AND MATERIALS: We reviewed EHR records of a random sample of women to determine the presence and location of risk-factor information meeting National Comprehensive Cancer Network (NCCN) guidelines for a further genetic risk evaluation for breast and/or ovarian cancer, and determine whether the women were referred for such an evaluation. RESULTS: A thorough review of the EHR records of 299 women revealed that 24 (8%) met the NCCN criteria for referral for a further genetic risk evaluation; of these, 12 (50%) had no referral to a medical genetics clinic. CONCLUSIONS: Half of the women whose EHR records contain risk-factor information meeting the criteria for further genetic risk evaluation for heritable forms of breast and ovarian cancer were not referred.


Subject(s)
Breast Neoplasms/genetics , Electronic Health Records , Genetic Testing , Ovarian Neoplasms/genetics , Breast Neoplasms/prevention & control , Female , Genetic Diseases, Inborn/diagnosis , Humans , Ovarian Neoplasms/prevention & control , Referral and Consultation , Risk Factors
16.
Diabetes ; 69(11): 2523-2535, 2020 11.
Article in English | MEDLINE | ID: mdl-32868339

ABSTRACT

HLA-DQA1 and -DQB1 genes have significant and potentially causal associations with autoimmune type 1 diabetes (T1D). To follow up on the earlier analysis on high-risk HLA-DQ2.5 and DQ8.1, the current analysis uncovers seven residues (αa1, α157, α196, β9, β30, β57, and β70) that are resistant to T1D among subjects with DQ4-, 5-, 6-, and 7-resistant DQ haplotypes. These 7 residues form 13 common motifs: 6 motifs are significantly resistant, 6 motifs have modest or no associations (P values >0.05), and 1 motif has 7 copies observed among control subjects only. The motifs "DAAFYDG," "DAAYHDG," and "DAAYYDR" have significant resistance to T1D (odds ratios [ORs] 0.03, 0.25, and 0.18; P = 6.11 × 10⁻²⁴, 3.54 × 10⁻¹⁵, and 1.03 × 10⁻²¹, respectively). Remarkably, a change of a single residue from the motif "DAAYHDG" to "DAAYHSG" (D to S at β57) alters the resistance potential, from resistant motif (OR 0.15; P = 3.54 × 10⁻¹⁵) to a neutral motif (P = 0.183), the change of which was significant (Fisher P value = 0.0065). The extended set of linked residues associated with T1D resistance and unique to each cluster of HLA-DQ haplotypes represents facets of all known features and functions of these molecules: antigenic peptide binding, peptide-MHC class II complex stability, β167-169 RGD loop, T-cell receptor binding, formation of homodimer of α-β heterodimers, and cholesterol binding in the cell membrane rafts. Identification of these residues is a novel understanding of resistant DQ associations with T1D. Our analyses endow potential molecular approaches to identify immunological mechanisms that control disease susceptibility or resistance to provide novel targets for immunotherapeutic strategies.


Subject(s)
Amino Acid Motifs/genetics , Diabetes Mellitus, Type 1/genetics , HLA-DQ Antigens/metabolism , High-Throughput Nucleotide Sequencing/methods , Amino Acid Sequence , Gene Expression Regulation , Genetic Predisposition to Disease , HLA-DQ Antigens/genetics , Haplotypes , Humans , Models, Molecular , Protein Conformation
17.
Diabetes ; 69(7): 1573-1587, 2020 07.
Article in English | MEDLINE | ID: mdl-32245799

ABSTRACT

HLA-DQA1 and -DQB1 are strongly associated with type 1 diabetes (T1D), and DQ8.1 and DQ2.5 are major risk haplotypes. Next-generation targeted sequencing of HLA-DQA1 and -DQB1 in Swedish newly diagnosed 1- to 18-year-old patients (n = 962) and control subjects (n = 636) was used to construct abbreviated DQ haplotypes, converted into amino acid (AA) residues, and assessed for their associations with T1D. A hierarchically organized haplotype (HOH) association analysis allowed 45 unique DQ haplotypes to be categorized into seven clusters. The DQ8/9 cluster included two DQ8.1 risk and the DQ9 resistant haplotypes, and the DQ2 cluster included the DQ2.5 risk and DQ2.2 resistant haplotypes. Within each cluster, HOH found residues α44Q (odds ratio [OR] 3.29, P = 2.38 × 10^-85) and β57A (OR 3.44, P = 3.80 × 10^-84) to be associated with T1D in the DQ8/9 cluster representing all ten residues (α22, α23, α44, α49, α51, α53, α54, α73, α184, β57) due to complete linkage disequilibrium (LD) of α44 with eight such residues. Within the DQ2 cluster and due to LD, HOH analysis found α44C and β135D to share the risk for T1D (OR 2.10, P = 1.96 × 10^-20). The motif "QAD" of α44, β57, and β135 captured the T1D risk association of DQ8.1 (OR 3.44, P = 3.80 × 10^-84), and the corresponding motif "CAD" captured the risk association of DQ2.5 (OR 2.10, P = 1.96 × 10^-20). Two risk associations were related to GAD65 autoantibody (GADA) and IA-2 autoantibody (IA-2A) but in opposite directions. CAD was positively associated with GADA (OR 1.56, P = 6.35 × 10^-8) but negatively with IA-2A (OR 0.59, P = 6.55 × 10^-11). QAD was negatively associated with GADA (OR 0.88; P = 3.70 × 10^-3) but positively with IA-2A (OR 1.64; P = 2.40 × 10^-14), despite a single difference at α44. The residues are found in and around anchor pockets 1 and 9, as potential T-cell receptor contacts, in the areas for CD4 binding and putative homodimer formation. The identification of three HLA-DQ AAs (α44, β57, β135) conferring T1D risk should sharpen functional and translational studies.


Subject(s)
Diabetes Mellitus, Type 1/etiology , HLA-DQ Antigens/genetics , Adolescent , Amino Acid Motifs , Child , Child, Preschool , Diabetes Mellitus, Type 1/genetics , Diabetes Mellitus, Type 1/immunology , Genetic Predisposition to Disease , HLA-DQ Antigens/chemistry , HLA-DQ alpha-Chains/genetics , Haplotypes , Humans , Infant , Risk
18.
PLoS One ; 15(1): e0226803, 2020.
Article in English | MEDLINE | ID: mdl-31999736

ABSTRACT

BACKGROUND: HIV vaccine trials routinely measure multiple vaccine-elicited immune responses to compare regimens and study their potential associations with protection. Here we employ unsupervised learning tools facilitated by a bidirectional power transformation to explore the multivariate binding antibody and T-cell response patterns of immune responses elicited by two pox-protein HIV vaccine regimens. Both regimens utilized a recombinant canarypox vector (ALVAC-HIV) prime and a bivalent recombinant HIV-1 Envelope glycoprotein 120 subunit boost. We hypothesized that within each trial, there were participant subgroups sharing similar immune responses and that their frequencies differed across trials. METHODS AND FINDINGS: We analyzed data from three trials: RV144 (NCT00223080), HVTN 097 (NCT02109354), and HVTN 100 (NCT02404311), the latter of which was pivotal in advancing the tested pox-protein HIV vaccine regimen to the HVTN 702 Phase 2b/3 efficacy trial. We found that bivariate CD4+ T-cell and anti-V1V2 IgG/IgG3 antibody response patterns were similar by age, sex-at-birth, and body mass index, but differed for the pox-protein clade AE/B alum-adjuvanted regimen studied in RV144 and HVTN 097 (PAE/B/alum) compared to the pox-protein clade C/C MF59-adjuvanted regimen studied in HVTN 100 (PC/MF59). Specifically, more PAE/B/alum recipients had low CD4+ T-cell and high anti-V1V2 IgG/IgG3 responses, and more PC/MF59 recipients had broad responses of both types. Analyses limited to "vaccine-matched" antigens suggested that some of the differences in responses between the regimens could have been due to antigens in the assays that did not match the vaccine immunogens. Our approach was also useful in identifying subgroups with unusually absent or high co-responses across assay types, flagging individuals for further characterization by functional assays. We also found that co-responses of anti-V1V2 IgG/IgG3 and CD4+ T cells had broad variability. As additional immune response assays are standardized and validated, we anticipate our framework will be increasingly valuable for multivariate analysis. CONCLUSIONS: Our approach can be used to advance vaccine development objectives, including the characterization and comparison of candidate vaccine multivariate immune responses and improved design of studies to identify correlates of protection. For instance, results suggested that HVTN 702 will have adequate power to interrogate immune correlates involving anti-V1V2 IgG/IgG3 and CD4+ T-cell co-readouts, but will have lower power to study anti-gp120/gp140 IgG/IgG3 due to their lower dynamic ranges. The findings also generate hypotheses for future testing in experimental and computational analyses aimed at achieving a mechanistic understanding of vaccine-elicited immune response heterogeneity.


Subject(s)
AIDS Vaccines/administration & dosage , Antibodies, Neutralizing/immunology , CD4-Positive T-Lymphocytes/immunology , HIV Antibodies/immunology , HIV Antigens/immunology , HIV Infections/prevention & control , HIV-1/immunology , AIDS Vaccines/immunology , Adolescent , Adult , Antibody Formation/immunology , Female , HIV Antibodies/metabolism , HIV Infections/epidemiology , HIV Infections/immunology , Humans , Male , South Africa/epidemiology , Thailand/epidemiology , Young Adult
19.
Am J Hum Genet ; 106(1): 112-120, 2020 01 02.
Article in English | MEDLINE | ID: mdl-31883642

ABSTRACT

Whole-genome sequencing (WGS) can improve assessment of low-frequency and rare variants, particularly in non-European populations that have been underrepresented in existing genomic studies. The genetic determinants of C-reactive protein (CRP), a biomarker of chronic inflammation, have been extensively studied, with existing genome-wide association studies (GWASs) conducted in >200,000 individuals of European ancestry. In order to discover novel loci associated with CRP levels, we examined a multi-ancestry population (n = 23,279) with WGS (∼38× coverage) from the Trans-Omics for Precision Medicine (TOPMed) program. We found evidence for eight distinct associations at the CRP locus, including two variants that have not been identified previously (rs11265259 and rs181704186), both of which are non-coding and more common in individuals of African ancestry (∼10% and ∼1% minor allele frequency, respectively, and rare or monomorphic in 1000 Genomes populations of East Asian, South Asian, and European ancestry). We show that the minor (G) allele of rs181704186 is associated with lower CRP levels and decreased transcriptional activity and protein binding in vitro, providing a plausible molecular mechanism for this African ancestry-specific signal. The individuals homozygous for rs181704186-G have a mean CRP level of 0.23 mg/L, in contrast to individuals heterozygous for rs181704186 with mean CRP of 2.97 mg/L and major allele homozygotes with mean CRP of 4.11 mg/L. This study demonstrates the utility of WGS in multi-ethnic populations to drive discovery of complex trait associations of large effect and to identify functional alleles in noncoding regulatory regions.


Subject(s)
Asian People/genetics , Black People/genetics , C-Reactive Protein/genetics , Genetic Predisposition to Disease , Polymorphism, Single Nucleotide , White People/genetics , Whole Genome Sequencing/methods , Cohort Studies , Gene Frequency , Genome-Wide Association Study , Humans , Linkage Disequilibrium
20.
J Glob Health ; 9(2): 0204279, 2019 Dec.
Article in English | MEDLINE | ID: mdl-31673351

ABSTRACT

BACKGROUND: Health information exchange (HIE) is frequently cited as an important objective of health information technology investment because of its potential to improve quality, reduce cost, and increase patient satisfaction. In this paper we examine the status and practices of HIE in six countries, drawn from a range of higher and lower income regions. METHODS: For each of the countries represented - China, England, India, Scotland, Switzerland, and the United States - we describe the state of current practice of HIE with reference to two scenarios: transfer of care and referral. For each country we discuss national objectives, barriers and plans for further advancing clinical information exchange. RESULTS: The countries vary widely in levels of adoption of EHRs, availability of health information in electronic form suitable for HIE, and in the information technology infrastructure to be used for transmission. Common themes emerged, however, including an expectation that information will be exchanged rather than gathered anew, the need for incentives to promote information exchange, and concerns about data security and patient confidentiality. CONCLUSIONS: Although the ability to transfer health information to where it is most needed is nearly always mentioned as an advantage of HIE adoption, there are wide differences in the degree to which this has been achieved to support the scenarios used in this study. Nevertheless, these differences indicate varying stages of progress along a comparable pathway, with similar barriers being identified in the countries described. In some cases, these have been partially surmounted while elsewhere work is needed. We reflect on contextual factors influencing the status and direction of HIE efforts in different global regions and their implications for progress.


Subject(s)
Health Information Exchange/statistics & numerical data , China , England , Humans , India , Scotland , Switzerland , United States
...